Abstract

There are games with a unique Nash equilibrium but such that, 
for almost all initial conditions, all strategies in the support of this 
equilibrium are eliminated by the replicator dynamics and the best- 
reply dynamics. 
1 Introduction

Evolutionary game dynamics model the evolution of the mean behavior in pop- 
ulations of agents interacting strategically. A most studied topic is the link 
between the outcome of these dynamics and Nash equilibria. Many positive 
connections have been found, including convergence to the set of Nash equilibria
for many dynamics in special classes of games (Sandholm, 2010). In general
though, solutions of evolutionary game dynamics need not converge to the set 
of Nash equilibria (Hofbauer and Sigmund, 1998, section 8.6). By contrast 
with no-regret dynamics (e.g., Hart, 2005 and references therein), replacing 
Nash equilibria by correlated equilibria and convergence of the solutions by 
convergence of some time-average hardly helps: for many dynamics, there are 
examples of games with a unique Nash equilibrium, which is also the unique 
correlated equilibrium, but such that, for some initial conditions, all strategies 
in the support of this equilibrium are eliminated (Viossat, 2007, 2008). 

In these examples however, the Nash equilibrium is strict and thus asymp- 
totically stable under reasonable dynamics. This leads to the following ques- 
tion: are there games such that all strategies in the support of Nash equilibria 
are eliminated for almost all initial conditions? This article shows that the
answer is positive, at least for the two most studied dynamics: the replicator 
dynamics (REP) and the best-reply dynamics (BR). For BR, we exhibit an 
open set of such games. 

Our examples are relatively high dimensional: $6 \times 6$ games for BR and
$7 \times 7$ games for REP. The reason why we need an extra dimension for the replicator
dynamics seems purely technical: our examples for the best-reply dynamics
should work as well for the replicator dynamics, but this is not so easy to
prove, as the replicator dynamics is more difficult to analyze than the best-reply
dynamics.

The reason why our games are relatively high dimensional is deeper: first, 
by the folk-theorem of evolutionary game theory (Weibull, 1995, Prop. 4.11), 
if an interior trajectory of REP or BR converges to a point, then this point is a 
Nash equilibrium. Thus, we need nonconvergent trajectories along which,
asymptotically, only strategies that do not belong to the support of a Nash
equilibrium have positive probability. For single population dynamics, this
seems to require at least three strategies not in the support of Nash equilibria.
Moreover, the only solution for having a unique strategy in the support of at 
least one Nash equilibrium is to have a unique, pure Nash equilibrium. But 
such a Nash equilibrium would be strict, hence asymptotically stable: 

Proposition 1.1. In a bimatrix game, a unique and pure Nash equilibrium is 
strict. 

Proof. A Nash equilibrium is quasi-strict if each player puts positive weight 
on each of her pure best-replies. In a bimatrix game, if a Nash equilibrium is 
unique, then it is quasi-strict (Jansen, 1981; Norde, 1999); if it is unique and 
pure, it is quasi-strict and pure, hence strict. □ 

We thus need at least two strategies in the support of Nash equilibria. With 
the three strategies not in the support of equilibria, this makes at least five 
strategies. Our examples for the best-reply dynamics are 6x6 games: there 
might be room for improvement, but not much. 

The remainder of this article is organized as follows: the framework and 
the notation are introduced below. Section 2 studies the behavior of the best-reply
dynamics in a family of $6 \times 6$ games. Section 3 studies the replicator
dynamics in a specific $7 \times 7$ game. Section 4 concludes. Appendix A shows
that the games we study have a unique Nash equilibrium. Appendix B studies
the behavior of the best-reply dynamics in the $7 \times 7$ game of Section 3.

Notation and definitions. We study single-population dynamics in two-player, 
finite symmetric games. The set of pure strategies is $I = \{1, 2, \ldots, N\}$ and the
payoff matrix is $U = (u_{ij})_{1 \le i,j \le N}$. Thus, $u_{ij}$ is the payoff of an individual
playing strategy $i$ against an individual playing strategy $j$. Let $S_N$ denote the
simplex of mixed strategies (henceforth, "the simplex"):

$$S_N := \left\{ x \in \mathbb{R}^N : x_i \ge 0 \ \forall i \in I, \ \sum_{i \in I} x_i = 1 \right\}.$$

Its vertices $e_i$, $1 \le i \le N$, correspond to the pure strategies of the game. Note
that vectors and matrices are denoted by bold characters.

Denote by $x_i(t)$ the proportion of the population playing strategy $i$ at time $t$
and by $x(t) = (x_1(t), \ldots, x_N(t)) \in S_N$ the population profile (or mean strategy).
We often omit time arguments and write x for x(t). We study the evolution 
of the population profile under the two most studied dynamics: the replicator 
dynamics and the best-reply dynamics. 

The replicator dynamics (Taylor and Jonker, 1978) may be derived by
assuming that the per capita growth rate of the total number of individuals
playing strategy $i$ is the payoff of the game.$^1$ For frequencies of strategies, this
leads to:

$$\dot{x}_i = x_i \left[ (Ux)_i - x \cdot Ux \right] \tag{REP}$$

The right-hand side is Lipschitz in $x$, hence there is a unique solution
through each initial condition. This solution is interior if $x_i(t) > 0$ for all
$i \in I$ and all $t \in \mathbb{R}$. Since the faces of the simplex are invariant under (REP),
this boils down to the initial condition being interior; that is, $x_i(0) > 0$ for all
$i$ in $I$.

1 Or, up to a change of time, a background fitness plus the payoff of the game.

The best-reply dynamics (Gilboa and Matsui, 1991; Matsui, 1992) may
be derived by assuming that in each small time interval, a fraction of the
population revises its strategy and switches (rationally, but myopically) to a
best-reply to the current population profile. Since this best-reply need not
be unique, this does not lead to a differential equation but to the differential
inclusion:

$$\dot{x} \in BR(x) - x \tag{BR}$$

where $BR(x) = \{ y \in S_N : y \cdot Ux = \max_{z \in S_N} z \cdot Ux \}$ denotes the set of
mixed best-replies to $x$. A solution of the best-reply dynamics is an absolutely
continuous function $x : \mathbb{R}_+ \to S_N$ satisfying (BR) for almost every $t$. Solutions
exist for each initial condition, but need not be unique.
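
As a minimal illustration of the two dynamics just defined, the following Python sketch evaluates the right-hand side of the replicator dynamics and lists the pure best-replies to a population profile. The helper names (`payoffs`, `replicator_rhs`, `pure_best_replies`), the numerical tolerance and the example matrix are assumptions made for this illustration; they are not taken from the article.

```python
def payoffs(U, x):
    """Return the vector Ux: payoff of each pure strategy against profile x."""
    return [sum(u_ij * x_j for u_ij, x_j in zip(row, x)) for row in U]


def replicator_rhs(U, x):
    """Right-hand side of the replicator dynamics: x_i * [(Ux)_i - x . Ux]."""
    ux = payoffs(U, x)
    mean = sum(x_i * p for x_i, p in zip(x, ux))
    return [x_i * (p - mean) for x_i, p in zip(x, ux)]


def pure_best_replies(U, x, tol=1e-12):
    """Indices (0-based) of the pure strategies maximising (Ux)_i."""
    ux = payoffs(U, x)
    best = max(ux)
    return [i for i, p in enumerate(ux) if p >= best - tol]


# Hypothetical Rock-Paper-Scissors-like matrix, chosen only for illustration.
U = [[0, -3, 1],
     [1, 0, -3],
     [-3, 1, 0]]
x = [0.5, 0.3, 0.2]
xdot = replicator_rhs(U, x)   # components sum to zero: the simplex is invariant
```

When `pure_best_replies` returns a single index i, the best-reply dynamics reduces locally to the differential equation in which the solution points towards the vertex e_i; several indices signal a point where solutions may fail to be unique.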

Other definitions. The limit set of a solution $x(\cdot)$ of a given dynamics is
the set of accumulation points of $x(t)$ as $t \to +\infty$. A pure strategy $i$ belongs
to the support of a Nash equilibrium of a symmetric bimatrix game if there
is a Nash equilibrium $(x, y)$ such that $x_i > 0$ (or equivalently, due to the
symmetry of the game, a Nash equilibrium $(x, y)$ such that $y_i > 0$). Finally,
the pure strategy $i$ is eliminated (for a given solution $x(\cdot)$ of a given dynamics)
if $x_i(t) \to 0$ as $t \to +\infty$.

We show that there are games with a unique Nash equilibrium but such 
that, under the best-reply dynamics and the replicator dynamics, all strate- 
gies in the support of this equilibrium are eliminated for almost all initial 
conditions. 




2 Best-reply dynamics



2.1 A reminder on Rock-Paper-Scissors 

A general Rock-Paper-Scissors game (RPS) is a $3 \times 3$ symmetric game with
payoff matrix

$$\begin{pmatrix} a_1 & b_2 & c_3 \\ c_1 & a_2 & b_3 \\ b_1 & c_2 & a_3 \end{pmatrix} \quad \text{with } b_i < a_i < c_i, \ i = 1, 2, 3. \tag{1}$$

(As the game is symmetric, we only indicate the payoffs of the row player.) 
These games have a unique Nash equilibrium. It is symmetric and completely 
mixed. We say that the game is outward cycling if 



$$\prod_{i=1}^{3} (a_i - b_i) > \prod_{i=1}^{3} (c_i - a_i). \tag{2}$$



In that case, almost all solutions of the best-reply dynamics converge to a 
triangle, which Gaunersdorfer and Hofbauer (1995) called the Shapley triangle 
after Shapley (1964). It is defined by 



$$ST := \{ x \in S_3 : V(x) = 0 \} \quad \text{with} \quad V(x) = \max_{1 \le i \le 3} (Ux)_i - \sum_{i=1}^{3} a_i x_i. \tag{3}$$



Proposition 2.1 (Gaunersdorfer and Hofbauer, 1995). In an outward cycling
RPS game, for every initial condition different from the equilibrium, the solution
of the best-reply dynamics is uniquely defined and its limit set is the
Shapley triangle.$^2$



2 If $\prod_{i=1}^{3} (a_i - b_i) = \prod_{i=1}^{3} (c_i - a_i)$, e.g., if the game is zero-sum, the Shapley triangle is
degenerate and coincides with the equilibrium; if $\prod_{i=1}^{3} (a_i - b_i) < \prod_{i=1}^{3} (c_i - a_i)$, the Shapley
triangle is empty. In both cases, all solutions of the best-reply dynamics converge to the
equilibrium.
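
The three regimes distinguished above (outward cycling, the degenerate equality case, and the case where solutions converge to the equilibrium) depend only on the comparison of two products. A small Python sketch of that comparison follows; the function name `rps_regime` and the returned labels are illustrative choices, not terminology from the article beyond "outward cycling".

```python
from math import prod


def rps_regime(a, b, c):
    """Classify a general RPS game from its payoff triples.

    a, b, c are length-3 sequences with b[i] < a[i] < c[i], following the
    notation a_i, b_i, c_i of the general RPS payoff matrix.
    """
    if not all(bi < ai < ci for ai, bi, ci in zip(a, b, c)):
        raise ValueError("not an RPS game: need b_i < a_i < c_i for all i")
    lhs = prod(ai - bi for ai, bi in zip(a, b))   # product of (a_i - b_i)
    rhs = prod(ci - ai for ai, ci in zip(a, c))   # product of (c_i - a_i)
    if lhs > rhs:
        return "outward cycling"
    if lhs == rhs:          # exact comparison: use integers or fractions here
        return "degenerate"
    return "inward cycling"
```

For instance, `rps_regime([0, 0, 0], [-3, -3, -3], [1, 1, 1])` compares 27 with 1 and reports an outward cycling game. With floating-point payoffs the equality test is fragile, so exact arithmetic is preferable near the boundary case.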
A RPS game has cyclic symmetry if the payoffs $a_i$, $b_i$, $c_i$ are independent
of $i$. The Nash equilibrium is then $(1/3, 1/3, 1/3)$ and, up to a rescaling that
affects neither the equilibrium nor the dynamics we study, the payoffs may be
taken of the form:

$$\begin{pmatrix} 0 & -\alpha & \beta \\ \beta & 0 & -\alpha \\ -\alpha & \beta & 0 \end{pmatrix} \quad \text{with } \alpha > 0, \ \beta > 0. \tag{4}$$

The outward cycling condition (2) then boils down to $\alpha > \beta$, and the Shapley
triangle (3) to

$$ST = \left\{ x \in S_3 : \max_{1 \le i \le 3} (Ux)_i = 0 \right\}. \tag{5}$$

We now describe in detail the behavior of the best-reply dynamics in RPS
games, and give a sketch of proof of Proposition 2.1, as this allows us to introduce
some crucial tools. The first one is a version of the improvement principle
(Monderer and Sela, 1997). It says that when the solution of the best-reply
dynamics points towards a pure best-reply $i$, only certain strategies can become
best-replies: those that are better replies to $i$ than $i$ itself.

Lemma 2.2 (Improvement principle). Let $x(\cdot)$ be a solution of the best-reply
dynamics. Assume that on the interval $]T, T'[$, with $T < T'$, the unique best-reply
to $x(t)$ is strategy $i$. If strategy $j \ne i$ is a best-reply to $x(T')$ then
$u_{ji} \ge u_{ii}$.

Proof. See Viossat (2008, Lemma 4.2). □
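
The improvement principle restricts which strategies can take over as best-replies, and that restriction can be read off the payoff matrix. The Python sketch below computes, for each pure strategy i, the strategies j that are strictly better replies to i than i itself. The function name `better_replies`, the use of a strict inequality, and the example matrix (form (4) with the illustrative values alpha = 3, beta = 1) are assumptions made here.

```python
def better_replies(U):
    """Map each pure strategy i (0-based) to the list of j != i with u_ji > u_ii.

    U[j][i] is the payoff of strategy j against strategy i.
    """
    n = len(U)
    return {i: [j for j in range(n) if j != i and U[j][i] > U[i][i]]
            for i in range(n)}


# RPS matrix with cyclic symmetry; alpha = 3 and beta = 1 are illustrative values.
U = [[0, -3, 1],
     [1, 0, -3],
     [-3, 1, 0]]
graph = better_replies(U)
```

Here each strategy has exactly one better reply, which gives the cycle from strategy 1 to 2 to 3 and back to 1 (indices 0, 1, 2 in the code).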

Assume for instance that in a RPS game, strategy 1 is currently the unique
best-reply to the population profile $x(t)$, so that the solution points towards
$e_1$; that is, $\dot{x} = e_1 - x$. Since $(e_1, e_1)$ is not a Nash equilibrium, a new best-reply
must arise. By the improvement principle, this can only be strategy 2. The
solution then points towards the edge joining $e_1$ and $e_2$. Since in the game restricted
to strategies 1 and 2, strategy 2 strictly dominates strategy 1, strategy 2
immediately becomes the unique best-reply. Therefore the solution points
towards $e_2$, then towards $e_3$, then towards $e_1$ again, and so on.

By itself, this cyclic behavior does not preclude convergence to equilibrium.
Actually, if $\prod_{i=1}^{3} (a_i - b_i) < \prod_{i=1}^{3} (c_i - a_i)$ then the solutions cycle inwards,
and the times at which their direction changes accumulate as $x(t)$ converges,
in finite time, to the equilibrium (Gaunersdorfer and Hofbauer, 1995).

However, for outward cycling RPS games and for solutions that do not 
start at the equilibrium, this cyclic behavior goes on forever. This follows 
from the following observations, which we do not prove. Below, the function 
$V$ is defined in (3) and $v(t) = V(x(t))$.

(i) If the game is outward cycling, then V(x) is zero on the Shapley triangle, 
positive outside it, and negative inside, with its unique minimum attained at 
the equilibrium point. 

(ii) When the solution points towards a pure strategy (that is, $\dot{x} = e_i - x$
for some $i$), then $\dot{v}(t) = -v(t)$.

Consider a solution that does not start at the equilibrium. Combining (i),
(ii), and the above described cyclic behavior, we get that the solution cannot
approach the equilibrium, therefore the times at which the direction changes
cannot accumulate and the cyclic behavior goes on forever; thus, by (ii),
$v(t) \to 0$, hence $x(t) \to ST$. The limit set of the solution is then easily seen to
be the whole triangle.
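
The convergence of the maximal payoff to zero can be observed numerically. The following Python sketch applies an explicit Euler scheme to the best-reply dynamics in an outward cycling RPS game with cyclic symmetry. It is only an approximation: the step size, the time horizon, the tie-breaking rule and the payoff values (alpha = 3, beta = 1) are all assumptions made for illustration, and the discretisation introduces an error of the order of the step size.

```python
def payoffs(U, x):
    """Return the vector Ux."""
    return [sum(u * xj for u, xj in zip(row, x)) for row in U]


def simulate_best_reply(U, x0, dt=1e-3, t_end=20.0):
    """Explicit Euler scheme for x' = e_i - x, with i a pure best-reply to x.

    Ties are broken in favour of the lowest index, which selects one
    particular solution when several exist.
    """
    x = list(x0)
    for _ in range(int(t_end / dt)):
        ux = payoffs(U, x)
        i = ux.index(max(ux))
        x = [xj + dt * ((1.0 if j == i else 0.0) - xj) for j, xj in enumerate(x)]
    return x


# Diagonal payoffs are zero, so V(x) reduces to the maximal payoff max_i (Ux)_i.
U = [[0, -3, 1], [1, 0, -3], [-3, 1, 0]]
x_end = simulate_best_reply(U, [0.5, 0.3, 0.2])
v_end = max(payoffs(U, x_end))   # close to zero: x_end lies near the Shapley triangle
```

The final profile stays away from the equilibrium (1/3, 1/3, 1/3), in line with the outward cycling behavior described above.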
2.2 A 6 x 6 game

Consider the following $6 \times 6$ symmetric game:
(6)
Let $G_{123}$ and $G_{456}$ denote the $3 \times 3$ games obtained from (6) by restricting the
players to their first three and to their last three strategies, respectively. Both
$G_{123}$ and $G_{456}$ are outward cycling RPS games with cyclic symmetry. Their
unique Nash equilibria correspond in the whole game to, respectively:

$$n_{123} = \left( \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, 0, 0, 0 \right) \quad \text{and} \quad n_{456} = \left( 0, 0, 0, \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3} \right).$$

The payoffs are chosen so that $(n_{123}, n_{123})$ is a Nash equilibrium of (6) but
$(n_{456}, n_{456})$ is not.

Proposition 2.3. The game (6) has a unique Nash equilibrium: $(n_{123}, n_{123})$.
Proof. See Appendix A. □



Proposition 2.3 does not only state that $(n_{123}, n_{123})$ is the unique symmetric
Nash equilibrium, but also that there are no asymmetric Nash equilibria.
Nevertheless, from almost all initial conditions, all strategies in its support are
eliminated. More precisely, let $ST_{456}$ denote the Shapley triangle:

$$ST_{456} := \left\{ x \in S_6 : x_4 + x_5 + x_6 = 1 \text{ and } \max_{4 \le i \le 6} (Ux)_i = 0 \right\}. \tag{7}$$
Proposition 2.4. For almost every mixed strategy $x$ in $S_6$, there is a unique
solution $x(\cdot)$ of (BR) such that $x(0) = x$, and its limit set is the Shapley
triangle $ST_{456}$.

Proof. The proof relies on the improvement principle and the better-reply 
structure of the game, described in Fig. 1 below: 



[Figure 1: Better-replies to pure strategies in game (6). An arrow from $i$ to $j$ means
that $u_{ji} > u_{ii}$.]

Consider a solution $x(\cdot)$ of the best-reply dynamics. We may assume that
there is a unique best-reply to $x = x(0)$, since this holds for almost all $x$ in
$S_6$. There are then two cases.

Case 1: the unique best-reply to $x(0)$ is strategy 4, 5 or 6. Assume for
concreteness that this is strategy 4. The improvement principle (Lemma 2.2)
and the same reasoning as for RPS games imply that the solution first points
towards $e_4$, then towards $e_5$, then towards $e_6$, then towards $e_4$ again, in a cyclic
fashion. It may be shown exactly as in (Viossat, 2008, p. 33) that the times
at which the direction of the solution changes do not accumulate.$^3$ It follows
that this cyclic behavior goes on forever. Therefore, strategies 1, 2 and 3
never become best-replies, hence $x_i(t) = x_i(0) e^{-t} \to 0$ for all $i$ in $\{1, 2, 3\}$.
Moreover, when strategy $i \in \{4, 5, 6\}$ is the unique best-reply, the function
$v(t) = \max_{4 \le j \le 6} (Ux(t))_j$ is equal to $(Ux(t))_i = e_i \cdot Ux(t)$ and satisfies:

$$\dot{v} = e_i \cdot U\dot{x} = e_i \cdot U(e_i - x) = -e_i \cdot Ux = -v \tag{8}$$

Therefore $v(t) \to 0$, hence $x(t) \to ST_{456}$ as $t \to +\infty$.

3 The idea is to show that the function $W(x) = \max_{4 \le i, j \le 6} [(Ux)_i - (Ux)_j]$ is bounded
away from zero.

Case 2: the unique best-reply to $x(0)$ is strategy 1, 2 or 3. Assume for
concreteness that this is strategy 1. If none of the strategies 4, 5 and 6 ever
becomes a best-reply, the solution points towards $e_1$, then towards $e_2$, then
towards $e_3$, etc., and due to the same reasoning as in case 1, its limit set will
be the Shapley triangle

$$ST_{123} := \left\{ x \in S_6 : x_1 + x_2 + x_3 = 1 \text{ and } \max_{1 \le i \le 3} (Ux)_i = 0 \right\}.$$

This is impossible, because the payoffs are such that at one of the vertices of
this triangle, the closest to $e_3$, strategy 4 is the unique best-reply. This vertex
is given by $\frac{1}{13}(1, 3, 9, 0, 0, 0)$, see Gaunersdorfer and Hofbauer (1995, Eq. (3.6)).

Thus, there exists a first time $T > 0$ at which one of the strategies 4, 5
and 6 becomes a best-reply. Due to the improvement principle and to the
better-reply structure of the game (Fig. 1), this can only be strategy 4, and
just before $T$, the unique best-reply was strategy 3.
There are then two subcases:

Subcase 2.1: the pure best-replies at time $T$ are strategies 1, 3 and 4 (that
is, strategies 1 and 4 become best-replies at the same time). The dynamics then
admits several solutions and becomes more difficult to analyze. Fortunately,
the solution can be precisely traced back in time (in backward time, starting
from $T$, it moves away from $e_3$ along a straight line, then away from $e_2$, ...). It
follows that the set of initial conditions for which this case occurs is contained
in the intersection of the simplex with a countable union of hyperplanes of $\mathbb{R}^6$,
none of which contains the simplex. Therefore, this set has Lebesgue measure
zero (with respect to the simplex) and we can neglect this case.

Subcase 2.2: the pure best-replies at time $T$ are strategies 3 and 4. The
solution will then point towards the edge joining $e_3$ and $e_4$. Since in the game reduced
to strategies 3 and 4, strategy 4 strictly dominates strategy 3, it follows that
strategy 4 becomes the unique best-reply and we are back to case 1. □

Robustness to perturbations of the payoffs. The above proof uses only strict 
inequalities, which are unaffected by sufficiently small perturbations of the 
payoffs (the only modification is that the Shapley triangles and the underlying 
functions $V$ must be defined as in (3) because the diagonal terms need no
longer be zero). Moreover, since the game is a bimatrix game with a unique 
Nash equilibrium, it follows that any game in its neighborhood has a unique 
Nash equilibrium, and with the same support (Jansen, 1981). Therefore: 

Proposition 2.5. There exists a neighborhood of game (6) such that, for any
symmetric game in this neighborhood, the unique Nash equilibrium has support
in $\{1, 2, 3\} \times \{1, 2, 3\}$, but for almost all initial conditions, strategies 1, 2 and
3 are eliminated by the best-reply dynamics.
3 Replicator Dynamics

Up to a further rescaling, the payoff matrix of an outward cycling RPS game
with cyclic symmetry (4) may be taken of the form:

$$\begin{pmatrix} 0 & -1 & \varepsilon \\ \varepsilon & 0 & -1 \\ -1 & \varepsilon & 0 \end{pmatrix} \quad \text{with } 0 < \varepsilon < 1. \tag{9}$$

The behavior of the replicator dynamics in such games is well known. The
boundary $\Gamma = \{ x \in S_3 : x_1 x_2 x_3 = 0 \}$ forms a heteroclinic cycle, that is, a
globally invariant set consisting of saddle rest-points and saddle orbits connecting
these rest-points. Moreover:

Proposition 3.1 (Zeeman, 1980; Gaunersdorfer and Hofbauer, 1995). In
game (9), the set $\Gamma$ is asymptotically stable, all interior solutions that do not
start at the equilibrium $(1/3, 1/3, 1/3)$ converge to $\Gamma$ and the limit set of their
time-average is the Shapley triangle (5).

(If $y(\cdot)$ is a solution of (REP), its time-average at time $t \ne 0$ is $\frac{1}{t} \int_0^t y(s)\, ds$.)

Two other facts will prove useful: first, in game (9), the mean payoff is
always nonpositive:

Lemma 3.2. Let $U$ denote the payoff matrix (9). For all $x \in S_3$, $x \cdot Ux \le 0$.

Proof. A standard computation shows that $x \cdot Ux = \frac{-1 + \varepsilon}{2} \left[ 1 - \sum_{i=1}^{3} x_i^2 \right]$,
which is nonpositive since $\varepsilon < 1$ and $\sum_{i=1}^{3} x_i^2 \le 1$ for $x \in S_3$. □
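
The closed form behind this lemma can be checked numerically. The Python sketch below compares the mean payoff in game (9) with the expression (-1 + eps)/2 times (1 minus the sum of squared shares) on randomly drawn points of the simplex. The function names, the random seed and the sampling ranges are illustrative assumptions; a numerical check of this kind complements the algebraic proof but does not replace it.

```python
import random


def mean_payoff(eps, x):
    """x . Ux for the matrix (9): U = [[0,-1,eps],[eps,0,-1],[-1,eps,0]]."""
    U = [[0.0, -1.0, eps], [eps, 0.0, -1.0], [-1.0, eps, 0.0]]
    return sum(x[i] * U[i][j] * x[j] for i in range(3) for j in range(3))


def closed_form(eps, x):
    """(-1 + eps)/2 * (1 - sum_i x_i^2), valid when the x_i sum to one."""
    return (-1.0 + eps) / 2.0 * (1.0 - sum(xi * xi for xi in x))


def random_simplex_point(rng):
    """Draw a point of the 3-strategy simplex by normalising positive weights."""
    w = [rng.random() + 1e-9 for _ in range(3)]
    s = sum(w)
    return [wi / s for wi in w]


rng = random.Random(0)
samples = [(rng.uniform(0.01, 0.99), random_simplex_point(rng)) for _ in range(1000)]
max_gap = max(abs(mean_payoff(e, x) - closed_form(e, x)) for e, x in samples)
max_mean = max(mean_payoff(e, x) for e, x in samples)   # never positive
```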

Second, as computed by Gaunersdorfer and Hofbauer (1995, Eq. (3.6)),
the vertex of the Shapley triangle closest to $e_3$ is given by

$$q = \frac{1}{1 + \varepsilon + \varepsilon^2} \left( \varepsilon^2, \varepsilon, 1 \right) \tag{10}$$

Consider a solution $y(\cdot)$ of the replicator dynamics that does not start at
the equilibrium. Proposition 3.1 implies that $q$ is an accumulation point of
the time-average of $y(t)$. Moreover, for $\varepsilon$ small enough ($\varepsilon < 1/4$ suffices),
$4q_3 - 3 > 0$; this implies the following result:

$$\limsup_{t \to +\infty} \int_0^t (4 y_3(s) - 3)\, ds = +\infty \tag{11}$$
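
The role of the threshold on epsilon can be made concrete with a short computation. The Python sketch below assumes that the vertex of the Shapley triangle closest to e_3 has the form q = (eps^2, eps, 1) / (1 + eps + eps^2); this form is an assumption of the sketch, chosen because it makes strategies 1 and 3 both earn the maximal payoff 0 at q, and it should be checked against Gaunersdorfer and Hofbauer (1995, Eq. (3.6)). The function names are illustrative.

```python
def shapley_vertex(eps):
    """Assumed vertex q = (eps^2, eps, 1) / (1 + eps + eps^2) for game (9)."""
    s = 1.0 + eps + eps * eps
    return [eps * eps / s, eps / s, 1.0 / s]


def payoffs(eps, x):
    """Vector Ux for the matrix (9)."""
    U = [[0.0, -1.0, eps], [eps, 0.0, -1.0], [-1.0, eps, 0.0]]
    return [sum(U[i][j] * x[j] for j in range(3)) for i in range(3)]


eps = 0.2
q = shapley_vertex(eps)
uq = payoffs(eps, q)          # strategies 1 and 3 earn 0, strategy 2 earns less
margin = 4.0 * q[2] - 3.0     # the quantity 4*q_3 - 3, positive for this eps
```

With this form of q, the quantity 4 q_3 - 3 is positive exactly when eps + eps^2 < 1/3, which holds at eps = 1/4 and fails at eps = 1/2.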

Now consider the following $7 \times 7$ symmetric game:

(12)

with $\varepsilon > 0$ small enough.$^4$ The games obtained by restricting both players to
their first three or to their last three strategies are outward cycling Rock-Paper-Scissors
games with cyclic symmetry. The Nash equilibria of these games
correspond in the whole game to rest points of the replicator dynamics, which
we denote by $n_{123}$ and $n_{567}$:

$$n_{123} = \left( \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, 0, 0, 0, 0 \right), \qquad n_{567} = \left( 0, 0, 0, 0, \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3} \right).$$

The heteroclinic cycles of the RPS games correspond to heteroclinic cycles of
the whole game, which we denote by $\Gamma_{123}$ and $\Gamma_{567}$:

$$\Gamma_{123} := \{ x \in S_7 : x_1 + x_2 + x_3 = 1 \text{ and } x_1 x_2 x_3 = 0 \};$$

$$\Gamma_{567} := \{ x \in S_7 : x_5 + x_6 + x_7 = 1 \text{ and } x_5 x_6 x_7 = 0 \}.$$

4 In the proofs, for simplicity, we use $\varepsilon < 1/48$, but the results extend easily to $\varepsilon < 1/6$,
and probably beyond.




Proposition 3.3. $(n_{123}, n_{123})$ is the unique Nash equilibrium of game (12).$^5$

Proof. See Appendix A. □

5 There are no asymmetric Nash equilibria.

In spite of Proposition 3.3, the heteroclinic cycle $\Gamma_{123}$ is not asymptotically
stable. Indeed, at $e_3$, the unique best-reply is strategy 4. By contrast, though
$(n_{567}, n_{567})$ is not an equilibrium of (12):

Proposition 3.4. The heteroclinic cycle $\Gamma_{567}$ is asymptotically stable.

Proof. $\Gamma_{567}$ is asymptotically stable on the face spanned by $e_5$, $e_6$, $e_7$ due to
Proposition 3.1. Moreover, near the vertices $e_5$ to $e_7$, the payoffs of strategies
1 to 4 are less than the mean payoff, hence the shares of strategies 1 to 4
decrease. Then apply Thm. 17.5.1 of Hofbauer and Sigmund (1998). □

Thus, if a solution of the replicator dynamics approaches $\Gamma_{567}$ arbitrarily
closely, then it converges to it. We will show that this occurs for almost all
initial conditions. Together with Proposition 3.3, this implies that for almost
all initial conditions, all pure strategies in the support of the unique equilibrium
of game (12) are eliminated.

Roughly, if the solution starts close to the equilibrium, then it first spirals
towards the heteroclinic cycle $\Gamma_{123}$. Eventually, it spends enough time close to
$e_3$, where the unique best-reply is strategy 4, for $x_4$ to increase substantially.
Since strategies 5, 6, and 7 have very good payoffs against strategy 4, this triggers
a subsequent increase in $x_5$, $x_6$ and $x_7$. The solution then cycles towards $\Gamma_{567}$.
However, $x_4$ then decreases, which may lead to a come-back of strategies 1,
2, 3, and the whole process might start again. The difficulty is to make sure
that, each time this process runs, the solution gets closer to $\Gamma_{567}$.

For the replicator dynamics, this can be shown due to the last important
property of game (12): against strategies 4 to 7, strategies 1 to 3 have the same
payoffs. That is, for any $i$, $i'$ in $\{1, 2, 3\}$ and any $j$ in $\{4, 5, 6, 7\}$, $u_{ij} = u_{i'j}$.
Similarly, against strategies 1 to 4, strategies 5 to 7 have the same payoffs.
Due to linearity properties of the replicator dynamics, this implies that the
dynamics may be decomposed as we now explain.

Let $x(\cdot)$ be an interior solution of the replicator dynamics. For each $i$ in
$\{1, 2, 3\}$, define $\hat{x}_i(t)$ as the share of strategy $i$ at time $t$ relative to the total
share of strategies 1, 2 and 3:

$$\hat{x}_i := \frac{x_i}{x_1 + x_2 + x_3} \tag{13}$$

and let $\hat{x} = (\hat{x}_1, \hat{x}_2, \hat{x}_3)$. For $i \in \{5, 6, 7\}$, define similarly:

$$\tilde{x}_i := \frac{x_i}{x_5 + x_6 + x_7} \tag{14}$$

and let $\tilde{x} = (\tilde{x}_5, \tilde{x}_6, \tilde{x}_7)$. Finally, let

$$\lambda(t) = x_1(t) + x_2(t) + x_3(t) \quad \text{and} \quad \mu(t) = x_5(t) + x_6(t) + x_7(t)$$

denote respectively the total share of the first three and of the last three
strategies at time $t$. The evolution of $x$ is fully described by the joint evolution
of $\hat{x}$, $\tilde{x}$, $\lambda$ and $\mu$. The interest of this description is that, up to a change in
velocity, $\hat{x}$ and $\tilde{x}$ follow the replicator dynamics of the Rock-Paper-Scissors
game (9).

Formally, let $\hat{\tau}(t)$ denote the rescaled time

$$\hat{\tau}(t) := \int_0^t \lambda(s)\, ds \tag{15}$$
Let both $\hat{U}$ and $\tilde{U}$ denote the payoff matrix (9), depending on whether it arises
as the top-left or the bottom-right corner of game (12).$^6$

Lemma 3.5. Let $y(\cdot)$ denote the solution of (REP) in the RPS game (9), with
initial condition $y(0) = \hat{x}(0)$. We have:

$$\dot{\hat{x}}_i = \lambda \hat{x}_i \left[ (\hat{U}\hat{x})_i - \hat{x} \cdot \hat{U}\hat{x} \right] \quad \forall i = 1, 2, 3 \tag{16}$$

$$\forall t \in \mathbb{R}, \quad \hat{x}(t) = y(\hat{\tau}(t)) \tag{17}$$

Proof. The proof of (16) is the same as the proof of Lemma 5.2 of Viossat
(2007). Due to (16), $y(\hat{\tau}(t))$ and $\hat{x}(t)$ are solutions of the same differential
equation, which admits a unique solution through each initial condition. This
proves (17). □
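
The decomposition in this lemma relies only on two structural properties of the payoffs: the top-left block is the RPS matrix (9), and strategies 1 to 3 earn identical payoffs against strategies 4 to 7. The Python sketch below checks the resulting identity at a single point, for a randomly generated 7 x 7 matrix having these two properties. This matrix is hypothetical and is not game (12); the function names, seed and parameter values are likewise assumptions made for illustration.

```python
import random


def replicator_rhs(U, x):
    """Replicator vector field x_i * [(Ux)_i - x . Ux]."""
    n = len(x)
    ux = [sum(U[i][j] * x[j] for j in range(n)) for i in range(n)]
    mean = sum(xi * p for xi, p in zip(x, ux))
    return [xi * (p - mean) for xi, p in zip(x, ux)]


def hypothetical_game(eps, rng):
    """Random 7x7 matrix whose top-left block is (9) and whose rows 1-3
    coincide in columns 4-7. All other entries are arbitrary."""
    U = [[rng.uniform(-1.0, 1.0) for _ in range(7)] for _ in range(7)]
    block = [[0.0, -1.0, eps], [eps, 0.0, -1.0], [-1.0, eps, 0.0]]
    for i in range(3):
        U[i][:3] = block[i]
        U[i][3:] = U[0][3:]
    return U, block


rng = random.Random(1)
U, block = hypothetical_game(0.1, rng)
w = [rng.random() + 0.1 for _ in range(7)]
x = [wi / sum(w) for wi in w]                 # interior point of the simplex

xdot = replicator_rhs(U, x)
lam, lam_dot = sum(x[:3]), sum(xdot[:3])
x_hat = [xi / lam for xi in x[:3]]
# Time derivative of x_i / lambda, by the quotient rule.
lhs = [(xdot[i] * lam - x[i] * lam_dot) / lam ** 2 for i in range(3)]
# lambda times the replicator field of the 3x3 RPS block at the relative shares.
rhs = [lam * v for v in replicator_rhs(block, x_hat)]
max_gap = max(abs(l - r) for l, r in zip(lhs, rhs))
```

The two sides agree up to rounding error, because the common payoff earned against strategies 4 to 7 cancels when relative shares are differentiated.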

Similarly, if $z(\cdot)$ is the solution of the replicator dynamics in game (9) with
initial condition $z(0) = \tilde{x}(0)$, and $\tilde{\tau}(t)$ is the rescaled time

$$\tilde{\tau}(t) = \int_0^t \mu(s)\, ds \tag{18}$$

then

$$\forall t \in \mathbb{R}, \quad \tilde{x}(t) = z(\tilde{\tau}(t)) \tag{19}$$

We are now ready to prove the main result of this section:

Proposition 3.6. For any interior initial condition $x = x(0)$ such that neither
$x_1 = x_2 = x_3$ nor $x_5 = x_6 = x_7$, the solution of the replicator dynamics
converges to $\Gamma_{567}$. In particular, all pure strategies in the support of the unique
equilibrium of game (12) are eliminated.

6 The top-left and bottom-right RPS games of (12) need not be the same for the
results to hold; this is just to minimize the number of parameters.
Proof. The assumptions imply that $\hat{x}(0)$ and $\tilde{x}(0)$ are well defined and different
from $(\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$. We must show that $x(t)$ converges to $\Gamma_{567}$.

If $\lambda(t) \to 0$, then $x(t)$ converges to the face $\lambda = 0$. Since on this face
the payoff of strategy 4 is strictly smaller than the payoff of $n_{567}$, standard,
domination-like arguments imply that $x_4(t) \to 0$, hence $\mu(t) \to 1$ (see, e.g.,
Samuelson and Zhang, 1992). Due to (19) and Proposition 3.1, this implies
that $x(t) \to \Gamma_{567}$ and we are done. Thus it suffices to show that $\lambda(t)$ converges
to zero.

Assume by contradiction that this is not the case.
Claim 3.7. Recall that $\hat{\tau}(t) = \int_0^t \lambda(s)\, ds$. We have: $\hat{\tau}(t) \to +\infty$ as $t \to +\infty$.
Proof. $\lambda(t) \not\to 0$ and $\lambda$ is clearly Lipschitz. □
Claim 3.8. $\limsup_{t \to +\infty} \int_0^t [4x_3(s) - 3\lambda(s)]\, ds = +\infty$

Proof. By (17), definition of $\hat{\tau}$, and a change of variable,

$$\int_0^t \lambda(s) \hat{x}(s)\, ds = \int_0^t \hat{\tau}'(s)\, y(\hat{\tau}(s))\, ds = \int_0^{\hat{\tau}(t)} y(s)\, ds.$$

Therefore

$$\int_0^t (4x_3(s) - 3\lambda(s))\, ds = \int_0^t \lambda(s)(4\hat{x}_3(s) - 3)\, ds = \int_0^{\hat{\tau}(t)} (4y_3(s) - 3)\, ds.$$

Since $y(0) \ne (1/3, 1/3, 1/3)$, the result follows from Claim 3.7 and Eq. (11). □

Claim 3.9. $\limsup_{t \to +\infty} \mu(t) \ge \frac{1}{1 + \varepsilon}$

Proof. Using (REP), an easy computation shows that

$$\frac{d}{dt} \ln\left( \frac{x_4}{\lambda} \right) = 4x_3 + 10x_4 - 2\lambda - \lambda\, \hat{x} \cdot \hat{U}\hat{x} - \varepsilon\mu \ge 4x_3 + 10x_4 - 2\lambda - \varepsilon\mu \tag{20}$$

where the inequality follows from Lemma 3.2.

Assume by contradiction that $\limsup_{t \to +\infty} \mu(t) < \frac{1}{1 + \varepsilon}$. Thus, omitting time
arguments, there exists a time $T$ such that for all $t \ge T$, $(1 + \varepsilon)\mu < 1 = \mu + \lambda + x_4$,
hence $\varepsilon\mu < \lambda + x_4$. Together with (20), this implies that for $t \ge T$:

$$\frac{d}{dt} \ln\left( \frac{x_4}{\lambda} \right) \ge 4x_3 + 9x_4 - 3\lambda \ge 4x_3 - 3\lambda \tag{21}$$

By Claim 3.8, it follows that $\limsup \ln(x_4/\lambda) = +\infty$. Thus, there exists a time
$T' > T$ such that $x_4 > \lambda$. Due to the first inequality in (21), for $t \ge T'$, $x_4$
remains greater than $\lambda$ and $\frac{d}{dt} \ln(\frac{x_4}{\lambda}) \ge 9x_4 - 3\lambda \ge 6\lambda$. By Claim 3.7, this
implies that $x_4/\lambda \to +\infty$ hence $\lambda \to 0$, a contradiction. □

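We now conclude. Recall the definition of $\tilde{\tau}$ in (18). A corollary of Claim
3.9 is that $\tilde{\tau}(t) \to +\infty$ as $t \to +\infty$. By (19) and Proposition 3.1, it follows
that $\tilde{x}$ converges to the heteroclinic cycle of game (9). It is easy to check that
along this cycle, the mean payoff is always greater than $-\frac{1}{4}$. Therefore:

$$\exists T_1 > 0, \ \forall t \ge T_1, \quad \tilde{x}(t) \cdot \tilde{U}\tilde{x}(t) \ge -\frac{1}{4} - \varepsilon \tag{22}$$

Moreover, (REP) and a somewhat tedious computation show that:

$$\frac{d}{dt} \ln\left( \frac{\mu}{\lambda} \right) = \mu \left( \tilde{x} \cdot \tilde{U}\tilde{x} + \frac{1}{3} - \varepsilon \right) - \lambda \left( \frac{1}{3} + \hat{x} \cdot \hat{U}\hat{x} \right) + 20x_4 \tag{23}$$

Assuming $\varepsilon < \frac{1}{48}$, (22), (23) and Lemma 3.2 imply that for $t \ge T_1$:

$$\frac{d}{dt} \ln\left( \frac{\mu}{\lambda} \right) \ge \frac{\mu}{24} - \frac{\lambda}{3} \tag{24}$$

It follows from Claim 3.9 that there exists a time $T_2 \ge T_1$ at which the ratio
$\mu/\lambda$ is greater than 16. By (24), this ratio then keeps increasing hence, by (24)
again,

$$\forall t \ge T_2, \quad \frac{d}{dt} \ln\left( \frac{\mu}{\lambda} \right) \ge \frac{16\lambda}{24} - \frac{\lambda}{3} = \frac{\lambda(t)}{3} \tag{25}$$

By Claim 3.7, this implies that $\lambda$ goes to zero, a final contradiction. □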




Perturbation of payoffs. As for game (6), any game sufficiently close to
game (12) in the payoff space has a unique Nash equilibrium, and its support is
$\{1, 2, 3\} \times \{1, 2, 3\}$. We conjecture that the result of Proposition 3.6 generalizes
to such nearby games. That is, for almost all initial conditions, the solution
of the replicator dynamics converges to the boundary of the face spanned by
$e_5$, $e_6$ and $e_7$, hence all pure strategies in the support of the unique Nash
equilibrium are eliminated. Our proof does not go through however, because
Lemma 3.5 requires a very specific payoff structure.

Correlated equilibrium. By contrast with the games of Viossat (2007,
2008), the Nash equilibrium of games (6) and (12) is not the unique correlated
equilibrium. Whether reasonable dynamics may eliminate all strategies used
in correlated equilibrium for almost all initial conditions is an open question.

Other dynamics. A variant of Lemma 3.5 holds for the discrete-time
replicator dynamics:

$$x_i(n + 1) = x_i(n) \frac{C + (Ux)_i}{C + x \cdot Ux} \quad \text{with } C > -\min_{i, x} (Ux)_i \tag{26}$$

Thus, extending Proposition 3.6 to (26) should be relatively simple. Proposition
3.6 might also extend to some classes of payoff functional dynamics

$$\dot{x}_i = x_i \left[ f([Ux]_i) - \sum_j x_j f([Ux]_j) \right] \tag{27}$$

with $f$ an increasing and sufficiently smooth function from $\mathbb{R}$ to $\mathbb{R}$. This might
be hard to prove though, as Lemma 3.5 builds on linearity properties which
are specific to the replicator dynamics.$^7$

7 One reason to hope for a generalization is that, in Rock-Paper-Scissors games, close to
the equilibrium, dynamics (27) behave as the replicator dynamics (Hofbauer and Sigmund,
1998, exercise 8.1.1; Viossat, 2011, footnote 6).
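
For readers who want to experiment with the discrete-time replicator dynamics mentioned above, here is a minimal Python sketch of one step of the map. It takes the numerator of the update to be C + (Ux)_i, the usual form of this map, which is an assumption here. The function name, the constant C = 2, the initial profile and the value of epsilon are also illustrative choices.

```python
def discrete_replicator_step(U, x, C):
    """One step of the map x_i <- x_i * (C + (Ux)_i) / (C + x . Ux).

    C must exceed minus the smallest payoff so that shares stay positive.
    """
    ux = [sum(u * xj for u, xj in zip(row, x)) for row in U]
    mean = sum(xi * p for xi, p in zip(x, ux))
    return [xi * (C + p) / (C + mean) for xi, p in zip(x, ux)]


eps = 0.1
U = [[0.0, -1.0, eps], [eps, 0.0, -1.0], [-1.0, eps, 0.0]]   # RPS matrix (9)
C = 2.0   # any C > 1 is admissible here, since payoffs are at least -1
x = [0.5, 0.3, 0.2]
for _ in range(100):
    x = discrete_replicator_step(U, x, C)
```

Each step preserves the simplex: the shares remain positive and keep summing to one, up to rounding error.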






Finally, there is a strong link between the best-reply dynamics and the
time-average of the replicator dynamics (Gaunersdorfer and Hofbauer, 1995;
Hofbauer et al., 2009). For this reason, we conjecture that Proposition 2.4 extends
to (REP); that is, in game (6), for almost all initial conditions, all strategies
in the support of the equilibrium are eliminated under (REP). What we
can show, in the same spirit, is that Proposition 3.6 extends to the best-reply
dynamics, up to replacement of the heteroclinic cycle $\Gamma_{567}$ by the corresponding
Shapley triangle:

$$ST_{567} := \left\{ x \in S_7 : x_5 + x_6 + x_7 = 1 \text{ and } \max_{5 \le i \le 7} (Ux)_i = 0 \right\}.$$

Proposition 3.10. Assume that $0 < \varepsilon < 2/9$. For any initial condition $x$
such that neither $x_1 = x_2 = x_3$ nor $x_5 = x_6 = x_7$, all solutions of the best-reply
dynamics converge to the Shapley triangle $ST_{567}$.

Proof. See Appendix B. Compared to Proposition 2.4, the added difficulty is
to deal with initial conditions through which several solutions of (BR) exist.
This can be done due to a decomposition of the best-reply dynamics similar
to Lemma 3.5. □

4 Discussion

In game (12), the Nash equilibrium is unique and quasi-strict, and therefore
persistent, regular, hence strongly stable, essential, strictly proper, strictly perfect,
etc. (van Damme, 1991). Thus, from the traditional, rationalistic point of
view, it may be seen as the unambiguous solution of the game. However, under
two of the most studied dynamics, all strategies in the support of this Nash
equilibrium are eliminated from almost all initial conditions. This indicates
an even wider gap between strategic and evolutionary considerations than had
been noted before.

We conjecture that elimination of all strategies in the support of Nash 
equilibria from almost all initial conditions occurs for many other dynamics, 
including multi-population dynamics. However, this might be hard to prove 
because this can only arise in relatively large games, in which having a precise 
understanding of dynamics more complex than the replicator dynamics or 
the best-reply dynamics might prove difficult. A way forward might be to 
consider nonlinear games and to replace, in the construction, Rock-Paper- 
Scissors games by hypnodisk games (Hofbauer and Sandholm, 2011). 

A Equilibrium uniqueness 

In this section, we show that games (6) and (12) have a unique equilibrium.
We begin with a lemma used in both proofs.

Consider a symmetric bimatrix game with pure strategy set $I = \{1, 2, \ldots, N\}$
and payoff matrix $U$. Let $I' \subset I$. For any $x$ in $S_N$, define $x' \in \mathbb{R}^N$ by $x'_i = x_i$
if $i \in I'$ and $x'_i = 0$ otherwise. Let $x(I') = \sum_{i \in I'} x_i$.

Lemma A.1. Let $(x, y)$ be a Nash equilibrium such that $x(I')y(I') > 0$. Assume
that against $x - x'$ and $y - y'$, the payoff of a strategy $i$ in $I'$ is independent
of $i$. That is, for all $i$ and $j$ in $I'$,

$$[U(y - y')]_i = [U(y - y')]_j \quad \text{and} \quad [U(x - x')]_i = [U(x - x')]_j \tag{28}$$

Then $(x', y')$ induces an unnormalized Nash equilibrium of the game restricted
to $I' \times I'$. That is, for all $i$, $j$ in $I'$:

$$x'_i > 0 \Rightarrow (Uy')_i \geq (Uy')_j \quad \text{and} \quad y'_i > 0 \Rightarrow (Ux')_i \geq (Ux')_j \tag{29}$$

Proof. Let $i \in I'$. If $x'_i > 0$ then $x_i > 0$, hence strategy $i$ is a best-reply
to $y$. Together with (28), this implies that for all $j$ in $I'$, $(Uy')_i - (Uy')_j =
(Uy)_i - (Uy)_j \geq 0$. This proves the first part of (29). The second part is
symmetric. □
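The identity used in this proof can be written out as a short worked step. It is only a sketch of the argument above: since $y = y' + (y - y')$ and payoffs are linear in $y$, condition (28) makes the contribution of $y - y'$ cancel in payoff differences within $I'$.

```latex
\begin{align*}
(Uy)_i - (Uy)_j
  &= (Uy')_i - (Uy')_j + \big[U(y-y')\big]_i - \big[U(y-y')\big]_j \\
  &= (Uy')_i - (Uy')_j
  \qquad \text{for all } i, j \in I', \text{ by (28)}.
\end{align*}
```

If $x'_i > 0$, then $i$ is a best reply to $y$, so the left-hand side is nonnegative, which is the first part of (29).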

Proof of Proposition 2.3. Let $(x, y)$ be a Nash equilibrium of (6). We want
to show that $x = y = n_{123} = (1/3, 1/3, 1/3, 0, 0, 0)$.

Step 1. $x_4 x_5 x_6 = 0$ and by symmetry $y_4 y_5 y_6 = 0$.

Indeed, if $x_4 x_5 x_6 > 0$, then strategies 4, 5 and 6 are all best replies to $y$,
hence so is $n_{456}$. This cannot be because, as is easily checked, $n_{456}$ is strictly
dominated by $n_{123}$.

Step 2. $y_1 + y_2 + y_3 > 0$ and by symmetry $x_1 + x_2 + x_3 > 0$.
Assume by contradiction that $y_1 = y_2 = y_3 = 0$. It follows that

$$\forall i \in \{1, 2, 3\}, \quad (Uy)_i = -1 < 0 \tag{30}$$

Furthermore, due to Step 1, $y$ has support in {4,5}, {5,6} or {4,6}. In any
case, there exists $i$ in {4,5,6} such that $(Uy)_i \geq 0$. Together with (30), this
implies that strategies 1, 2 and 3 are not best replies to $y$, hence $x_1 = x_2 =
x_3 = 0$. Thus, both $x$ and $y$ have support in {4,5,6}, hence $(x, y)$ induces a
Nash equilibrium of the game restricted to {4,5,6} × {4,5,6}. This implies
that $x = y = n_{456}$, which contradicts Step 1.

Step 3. $x_1 = x_2 = x_3$ and $y_1 = y_2 = y_3$.
Let $x_{123} = (x_1, x_2, x_3, 0, 0, 0)$ and $x_{456} = x - x_{123} = (0, 0, 0, x_4, x_5, x_6)$. Define
$y_{123}$ and $y_{456}$ symmetrically. For every $i$ and $j$ in {1, 2, 3}, we have $(Ux_{456})_i =
(Ux_{456})_j$ and $(Uy_{456})_i = (Uy_{456})_j$. Therefore, it follows from Step 2 and from
Lemma A.1 applied with $I' = \{1, 2, 3\}$ that $(x_{123}, y_{123})$ is an unnormalized
Nash equilibrium of the game restricted to {1,2,3} × {1,2,3}. Therefore $x_{123}$
and $y_{123}$ are both proportional to $n_{123}$.

Step 4. $x_4 + x_5 + x_6 = 0$ and by symmetry $y_4 + y_5 + y_6 = 0$.
Assume by contradiction that $x_4 + x_5 + x_6 > 0$. Against $n_{123}$, every strategy
$i$ in {4, 5, 6} earns the same payoff: $-5/3$. Thus, by Step 3, for every $i$ and $j$
in {4,5,6}, we have $(Ux_{123})_i = (Ux_{123})_j$ and $(Uy_{123})_i = (Uy_{123})_j$. Together
with Lemma A.1 with $I' = \{4, 5, 6\}$, this implies that if $y_4 + y_5 + y_6 > 0$ then $x_{456}$
and $y_{456}$ are proportional to $n_{456}$, hence $x_4 x_5 x_6 > 0$. This cannot be due to
Step 1. Therefore, $y_4 + y_5 + y_6 = 0$. But then, by Step 3, $y = n_{123}$. Therefore,
strategies 4, 5 and 6 are not best replies to $y$. Therefore $x_4 = x_5 = x_6 = 0$,
which contradicts our assumption.

Proposition 2.3 now follows from Steps 3 and 4. ■



Proof of Proposition 3.3. Recall the definition of $n_{123}$ and $n_{567}$:

$$n_{123} = (1/3, 1/3, 1/3, 0, 0, 0, 0) \ ; \quad n_{567} = (0, 0, 0, 0, 1/3, 1/3, 1/3).$$

Let $(x, y)$ be a Nash equilibrium of (12). Consider the conditions:

$$x_1 + x_2 + x_3 > 0 \quad \text{and} \quad y_1 + y_2 + y_3 > 0 \tag{31}$$

$$x_5 + x_6 + x_7 > 0 \quad \text{and} \quad y_5 + y_6 + y_7 > 0 \tag{32}$$

Note that, due to Lemma A.1:

Lemma A.2. If (31) holds, then $x_1 = x_2 = x_3$ and $y_1 = y_2 = y_3$. If (32)
holds, then $x_5 = x_6 = x_7$ and $y_5 = y_6 = y_7$.

We now examine four cases, depending on whether (31) and (32) hold or not:

Case 1. Assume that (31) holds. Then, by Lemma A.2, $y_1 = y_2 = y_3$. Therefore
$n_{567} \cdot Uy > (Uy)_4$, hence $x_4 = 0$. By symmetry, $y_4 = 0$.

Subcase 1.1. Assume furthermore that (32) holds. Then by Lemma A.2, $y_5 = y_6 = y_7$.
Since $y_4 = 0$ and $y_1 = y_2 = y_3$, it follows that $y$ is a convex combination of
$n_{123}$ and $n_{567}$. Against both of these strategies, the payoff of $n_{123}$ is strictly
greater than the payoff of strategies 5, 6 and 7. Thus, the latter cannot be
best-replies to $y$, hence $x_5 + x_6 + x_7 = 0$. This contradicts (32).

Subcase 1.2. Assume that (32) does not hold. Without loss of generality, assume that
$y_5 + y_6 + y_7 = 0$. Since $y_4 = 0$ and $y_1 = y_2 = y_3$, this implies that $y = n_{123}$.
Therefore, as above, none of the strategies 5, 6 and 7 is a best reply to $y$.
Therefore $x_5 + x_6 + x_7 = 0$, which by the same argument implies $x = n_{123}$.
Therefore, $x = y = n_{123}$.

Case 2. Assume that (31) does not hold. Without loss of generality, assume $x_1 + x_2 +
x_3 = 0$. This implies that $n_{567}$ is a strictly better response to $x$ than strategy
4. Thus, $y_4 = 0$.

Subcase 2.1. Assume furthermore that (32) holds. Then, by Lemma A.2 and since
$y_4 = 0$, $y$ is a convex combination of $n_{567}$ and strategies 1, 2, 3. This implies
that $n_{123}$ is a strictly better response to $y$ than either 5, 6 or 7. Therefore,
$x_5 = x_6 = x_7 = 0$, contradicting (32).

Subcase 2.2. Assume that (32) does not hold. Then $x_5 + x_6 + x_7 = 0$ or $y_5 + y_6 + y_7 = 0$.
In the latter case, since $y_4 = 0$, it follows that $y$ has support in {1, 2, 3}, hence
that $n_{123}$ is a strictly better response to $y$ than either 5, 6, or 7; therefore, in
any case, $x_5 + x_6 + x_7 = 0$. Since we assumed $x_1 + x_2 + x_3 = 0$, it follows that
$x = e_4$. Therefore, $y$ must have support in {5, 6, 7}. It follows that $x$ is not a
best-reply to $y$, a contradiction.

Summing up, only Subcase 1.2 is possible, and then $x = y = n_{123}$. ■



B Best-reply dynamics in the 7x7 game (12)



This section proves Proposition 3.10. Recall the notation of Section 3: $\lambda$, $\mu$,
$\hat{x}$, $\tilde{x}$, $\hat{U}$ and $\tilde{U}$. Consider a solution of (BR) in game (12) such that initially
neither $x_1 = x_2 = x_3$ nor $x_5 = x_6 = x_7$. Thus, $\lambda(0) > 0$, $\mu(0) > 0$, $\hat{x}(0) \neq
(1/3, 1/3, 1/3)$ and $\tilde{x}(0) \neq (1/3, 1/3, 1/3)$. This implies that $\lambda(t)$ and $\mu(t)$ are
positive for all $t \geq 0$, as they can decrease at most exponentially.
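The lower bound behind this claim can be made explicit. The step below is a sketch for $\lambda$, assuming (consistently with (36) and (37)) that $\lambda = x_1 + x_2 + x_3$, and writing $b$ for a mixed best reply such that $\dot{x} = b - x$ at almost every time; the same argument applies to $\mu$.

```latex
% Along a solution, \dot{x}_i = b_i - x_i with b_i \ge 0 for almost all t, hence
\[
\dot{\lambda} = \sum_{i=1}^{3} (b_i - x_i) \;\ge\; -\lambda
\quad\Longrightarrow\quad
\frac{d}{dt}\big(e^{t}\lambda(t)\big) \ge 0
\quad\Longrightarrow\quad
\lambda(t) \ge \lambda(0)\, e^{-t} > 0 .
\]
```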

We first show that, up to a change of velocity, $\hat{x}$ and $\tilde{x}$ follow the
best-reply dynamics in the RPS game (4). Below, BR(·) denotes the best-reply
correspondence in game (4).

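Lemma B.1. For almost all times $t$:

$$\dot{\hat{x}} \in \left(1 + \frac{\dot{\lambda}}{\lambda}\right) (BR(\hat{x}) - \hat{x}) \quad \text{and} \quad \dot{\tilde{x}} \in \left(1 + \frac{\dot{\mu}}{\mu}\right) (BR(\tilde{x}) - \tilde{x})$$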

Proof. We prove the first part. The proof of the second part is the same. Let
$b$ be a best reply to $x(t)$ in game (12) such that $\dot{x}(t) = b - x(t)$.

Case 1. If $b_i = 0$ for all $i = 1, 2, 3$, then $\dot{x}_i = -x_i$ for all $i = 1, 2, 3$. This
implies that $\dot{\hat{x}} = 0$ and that $\dot{\lambda} = -\lambda$, so that the result holds trivially.

Case 2. Otherwise, define $\hat{b}$ from $b$ in the same way as $\hat{x}$ is defined from $x$.
A few lines of algebra show that, independently of the payoffs:

$$\dot{\hat{x}} = \left(1 + \frac{\dot{\lambda}}{\lambda}\right) (\hat{b} - \hat{x}) \tag{33}$$

Moreover, since all strategies in {1,2,3} earn the same payoffs against strategies
in {4, 5, 6, 7}, a variant of Lemma A.1 shows that $\hat{b} \in BR(\hat{x})$, hence the
result. □
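The "few lines of algebra" of Case 2 can be sketched as follows. The computation assumes, as (36) and (37) suggest, that $\lambda = x_1 + x_2 + x_3$ and $\hat{x} = (x_1, x_2, x_3)/\lambda$; the symbols $x_{123} = (x_1, x_2, x_3)$, $b_{123} = (b_1, b_2, b_3)$ and $\beta = b_1 + b_2 + b_3$ are introduced here for illustration only, with $\hat{b} = b_{123}/\beta$ and $\beta > 0$ in Case 2.

```latex
\begin{align*}
\dot{\lambda} &= \beta - \lambda
  \quad\Longrightarrow\quad 1 + \frac{\dot{\lambda}}{\lambda} = \frac{\beta}{\lambda}, \\[4pt]
\dot{\hat{x}}
  &= \frac{\dot{x}_{123}\,\lambda - x_{123}\,\dot{\lambda}}{\lambda^{2}}
   = \frac{(b_{123} - x_{123})\,\lambda - x_{123}\,(\beta - \lambda)}{\lambda^{2}}
   = \frac{\lambda\, b_{123} - \beta\, x_{123}}{\lambda^{2}} \\[4pt]
  &= \frac{\beta}{\lambda}\left(\frac{b_{123}}{\beta} - \frac{x_{123}}{\lambda}\right)
   = \left(1 + \frac{\dot{\lambda}}{\lambda}\right)\big(\hat{b} - \hat{x}\big).
\end{align*}
```

With $\beta = 0$ the same computation gives $\dot{\hat{x}} = 0$ and $\dot{\lambda} = -\lambda$, which is Case 1.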

Recall the definition of the Shapley triangle in (5).

Lemma B.2. If $\lambda(t)$ (resp. $\mu(t)$) does not converge to 0, then the limit set
of $\hat{x}(t)$ (resp. $\tilde{x}(t)$) is the Shapley triangle (5), hence $\max_j (\hat{U}\hat{x})_j \to 0$ (resp.
$\max_j (\tilde{U}\tilde{x})_j \to 0$).

Proof. We only prove the first part (with $\lambda(t)$). The proof of the second part is
the same. Let $y(\cdot)$ be the unique solution of the best-reply dynamics in game
(4) with initial condition $y(0) = \hat{x}(0)$. Let $\tau(t)$ denote the rescaled time:

$$\tau(t) := \int_0^t \left[ 1 + \frac{\dot{\lambda}(s)}{\lambda(s)} \right] ds = t + \ln\left( \frac{\lambda(t)}{\lambda(0)} \right) \tag{34}$$

Note that $\tau(t)$ is nondecreasing as, due to (BR), $\dot{\lambda} \geq -\lambda$. Moreover $\limsup \lambda(t) > 0$,
hence $\tau(t) \to +\infty$ by (34). Furthermore, it follows from Lemma B.1 that for all
$t \geq 0$, $\hat{x}(t) = y(\tau(t))$. The result now follows from Proposition 2.1. □
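Lemma B.2 rests on the fact that, in the RPS game, best-reply trajectories approach the Shapley triangle, on which the maximal payoff vanishes. The following minimal sketch illustrates this numerically under assumptions that are not taken from the paper: a standard "bad" RPS matrix with zero diagonal (win 1, loss 2) stands in for game (4), the dynamics are approximated by a crude Euler scheme, and the names `RPS` and `simulate_best_reply` are hypothetical.

```python
import numpy as np

# Assumed "bad" Rock-Paper-Scissors payoffs (win < loss, zero diagonal).
# This matrix is an illustrative stand-in, not the paper's game (4).
A_LOSS, B_WIN = 2.0, 1.0
RPS = np.array([
    [0.0, -A_LOSS, B_WIN],
    [B_WIN, 0.0, -A_LOSS],
    [-A_LOSS, B_WIN, 0.0],
])


def simulate_best_reply(payoffs, x0, horizon=30.0, dt=1e-3):
    """Euler scheme for x' = b - x, with b a pure best reply to x.

    Returns the final state and the trajectory of max_j (payoffs @ x)_j.
    """
    x = np.asarray(x0, dtype=float)
    max_payoffs = []
    for _ in range(int(horizon / dt)):
        u = payoffs @ x
        best = int(np.argmax(u))      # ties are broken by lowest index
        target = np.zeros_like(x)
        target[best] = 1.0
        x = x + dt * (target - x)     # move towards the chosen best reply
        max_payoffs.append(float(np.max(payoffs @ x)))
    return x, np.array(max_payoffs)


x_final, v = simulate_best_reply(RPS, x0=[0.6, 0.3, 0.1])
print("final state:", np.round(x_final, 3))
print("max payoff after first / last step:", v[0], v[-1])
```

With a zero diagonal, the payoff of the current best reply decays at rate one between switches, which is why the maximal payoff tends to 0 in this sketch; the Euler step only introduces an error of the order of `dt`.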

Lemma B.3. There exists a time $T \geq 0$ such that none of the strategies 1, 2
and 3 is a best-reply to $x(T)$.

Proof. Assume by contradiction that for all $t \geq 0$,

$$\text{(i)} \ (Ux)_4 - \max_{1 \le i \le 3} (Ux)_i \leq 0 \quad \text{and} \quad \text{(ii)} \ \max_{5 \le i \le 7} (Ux)_i - \max_{1 \le i \le 3} (Ux)_i \leq 0 \tag{35}$$

Note that the payoff of a strategy $i$ in {1, 2, 3} may be written as

$$(Ux)_i = \lambda (\hat{U}\hat{x})_i - 10 x_4 + \mu (-1/3 + \epsilon). \tag{36}$$

Similarly, for all $j$ in {5, 6, 7},

$$(Ux)_j = -\lambda/3 + 10 x_4 + \mu (\tilde{U}\tilde{x})_{j'}, \quad \text{with } j' = j - 4. \tag{37}$$

Note also that since $e_4$ is not a best-reply to itself, $x_4(t)$ cannot converge to 1.
Now examine the following cases.

Case 1. If $\lambda(t) \to 0$. Then $\mu(t)$ does not converge to 0, thus it follows from
$\lambda(t) \to 0$, (37) and Lemma B.2 that

$$\limsup \max_{5 \le i \le 7} (Ux)_i = \limsup \left( 10 x_4 + \mu \max_j (\tilde{U}\tilde{x})_j \right) \geq 0$$

while $\limsup \max_{1 \le i \le 3} (Ux)_i \leq -1/3 + \epsilon < 0$. This contradicts (ii) in (35).

Case 2. If $\lambda(t)$ does not converge to 0. Then by Lemma B.2, $\hat{x}$ converges to
the Shapley triangle (5) and there is an increasing sequence $(t_n)$ with $t_n \to +\infty$
such that $\hat{x}(t_n) \to q$, where $q$ is the vertex of the Shapley triangle defined in

(HDD. 

Subcase 2.1. If $\mu(t) \to 0$. Together with $\hat{x}(t_n) \to q$, this implies that for $n$
large enough, strategy 4 is a strictly better reply to $x(t_n)$ than strategies 1, 2
and 3; this contradicts part (i) of (35).

Subcase 2.2. If $\mu(t)$ does not converge to 0. Then by Lemma B.2 and (5),
$\max_j (\hat{U}\hat{x})_j \to 0$ and $\max_j (\tilde{U}\tilde{x})_j \to 0$. Together with (36), (37), their analog
for strategy 4 and (35), this implies that along the sequence $(t_n)$:

(Ux) 4 - max(Ux), = A [(Uq) 4 - o(l)] + 10x 4 - < (38) 

l<i<3 

where $\bar{q} = (q_1, q_2, q_3, 0, 0, 0, 0)$, and

$$\max_{5 \le i \le 7} (Ux)_i - \max_{1 \le i \le 3} (Ux)_i = -\frac{\lambda}{3} + 20 x_4 + \mu \left[ \frac{1}{3} - \epsilon \right] + o(1) \leq 0 \tag{39}$$

Roughly, (38) implies that $\lambda/\mu$ should be small and (39) that $\lambda/\mu$ should be
large. Assuming conservatively that $x_4 = 0$, hence $\mu = 1 - \lambda$, these equations
may be shown to be incompatible for $\epsilon < 2/9$. □
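The expression in (39) can be recovered by subtracting (36) from (37) and using the two limits given by Lemma B.2. The worked step below is a sketch that takes (36) and (37) as given, writes $\hat{U}\hat{x}$ and $\tilde{U}\tilde{x}$ for the two rescaled RPS payoff vectors appearing there, and, for the final bound, adopts the conservative assumption $x_4 = 0$, $\mu = 1 - \lambda$ made in the proof.

```latex
\begin{align*}
\max_{5 \le j \le 7}(Ux)_j - \max_{1 \le i \le 3}(Ux)_i
 &= \Big[-\tfrac{\lambda}{3} + 10x_4 + \mu \max_j(\tilde{U}\tilde{x})_j\Big]
  - \Big[\lambda \max_i(\hat{U}\hat{x})_i - 10x_4 + \mu\big(-\tfrac13+\epsilon\big)\Big] \\
 &= -\tfrac{\lambda}{3} + 20x_4 + \mu\big(\tfrac13-\epsilon\big) + o(1).
\end{align*}
% With x_4 = 0 and \mu = 1 - \lambda, requiring this to be \le 0 in the limit gives
\[
(1-\lambda)\big(\tfrac13-\epsilon\big) \le \tfrac{\lambda}{3}
\quad\Longleftrightarrow\quad
\lambda \ge \frac{1-3\epsilon}{2-3\epsilon}
\quad\Longleftrightarrow\quad
\frac{\lambda}{\mu} \ge 1 - 3\epsilon .
\]
```

This is the sense in which (39) forces $\lambda/\mu$ to be large.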

We now conclude. By Lemma B.3, there exists a time $T$ such that none
of the strategies 1, 2 and 3 is a best-reply to $x(T)$. Moreover, due to Lemma
B.1, for all $t \geq 0$, $\tilde{x}(t) \neq (1/3, 1/3, 1/3)$, hence by a variant of Lemma A.1,
strategies 5, 6 and 7 cannot all be best-replies to $x(t)$. Thus, due to the cyclic
symmetry of strategies 5 to 7, we may assume that the set of pure best-replies
to $x(T)$ is one of the following:

Case 1: {5} or {5,6} ; Case 2: {4,6} or {4,5,6} ; or Case 3: {4}

In Case 1, the same arguments as in Proposition 2.4 show that $x(t) \to
ST_{567}$. In Case 2, since strategy 4 is strictly dominated by strategy 6 on the
face spanned by $e_4$, $e_5$ and $e_6$, it immediately ceases to be a best-reply. This
leads to Case 1. In Case 3, since $e_4$ is not a best-reply to itself, there exists
a first time $T' > T$ at which $e_4$ is not the unique best-reply to $x(T')$. Due to
the improvement principle (Lemma 2.2), none of the strategies 1, 2 and 3 is a
best-reply to $x(T')$. Thus we are back to Case 2. This concludes the proof.